Use of expression profiling for identifying molecular markers useful for diagnosis of metastatic cancer

ABSTRACT

A method is presented for identifying molecular markers useful for detecting metastasized tumors in mammals. The method comprises identifying candidate molecular markers that are associated with terminal differentiation in the tissue in which a tumor arises, and identifying those candidate molecular markers that continue to be expressed in tumors from that tissue but not in the biopsy tissue.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application Number 60/192,229 filed Mar. 27, 2000, which is incorporated herein by reference.

FIELD OF THE INVENTION

[0002] The present invention relates to methods to isolate molecular markers that may be used to detect metastasized tumor cells.

BACKGROUND OF THE INVENTION

[0003] Throughout the specification references are cited in brackets. These references are incorporated by reference in their entirety to describe the state of the art.

[0004] Cancer represents a significant worldwide health problem. Cancer is an uncontrolled growth and spread of cells. For many cancers, metastasis to adjacent or distant tissues results in physiologic impairment and often death. Early diagnosis and the ability to diagnose metastasis of primary tumors represent significant challenges in the effective treatment of neoplastic disease.

[0005] Stage at diagnosis is the single most important prognostic determinant for patients with cancer and dictates the role of adjuvant chemotherapy in this disease. Given the prognostic and therapeutic importance of staging, accurate histopathologic evaluation of lymph nodes to detect invasion by cancer cells is crucial. Specific diagnosis of cancer metastasis is currently performed by histologic and cytologic resemblance to normal tissue. Cancer cells frequently maintain the phenotypic characteristics of their normal cell of origin.

[0006] However, conventional microscopic lymph node examination has methodological limitations. Differentiation of single or even small clumps of tumor cells from other cell types can be difficult, limiting sensitivity. The standard practice of examining only several tissue sections from each lymph node can omit from review >99% of each specimen, introducing sampling error. These limitations are evident when the frequency of recurrence in patients with stage I and II colorectal cancer is considered. By definition, these patients do not have extra-intestinal disease at the time of curative resection. However, recurrence rates of 10% to 30% for lesions confined to the mucosa (stage I) and 30% to 50% for lesions confined to the bowel wall (stage II) have been reported.

[0007] Alternative methods to detect small numbers of tumor cells have been applied to staging, including intensive review of serial tissue sections, PCR to detect tumor-specific mutations, immunohistochemistry, and RT-PCR to detect the expression of biomarkers that are specifically expressed in cells that have undergone neoplastic transformation (Sloane, 1995, Lancet 345: 1255-6; Abati and Liotta, 1996, Cancer 78: 10-66). In some colorectal cancer studies, staging by these sensitive methods has correlated with disease. However, the labor- and cost-intensity of serial sectioning, the lack of uniform association between mutations and neoplastic transformation, and the lack of specificity of many biomarkers limit the applicability of these methods.

[0008] Easily detected molecular markers that are uniformly expressed by larger numbers of metastasized tumors would therefore be useful for metastasis detection and disease staging. Particularly needed is methodology to isolate useful molecular markers for the detection of metastatic tumor cells in tissues and/or bodily fluids. Such methodology would ideally be high throughput and utilize established robust protocols.

SUMMARY OF THE INVENTION

[0009] The present invention relates to a method for the isolation of tissue-specific molecular markers that are useful in the diagnosis of metastatic cancer.

[0010] One aspect of the invention is a method to identify molecular markers useful for detecting tumor cells that have metastasized from an origin tissue to a destination tissue or fluid. The method comprises the steps of down-regulating in a population of origin tissue cells the activity of a transcription factor associated with terminal differentiation in the origin tissue, comparing an expression profile of the population of down-regulated origin cells with an expression profile of a population of control origin cells, identifying candidate markers which are expressed in the population of control origin cells but not the population of down-regulated origin cells, and comparing expression of the candidate markers in populations of control origin cells, cancerous origin cells and destination cells, wherein a candidate marker which is expressed in the populations of control origin cells and cancerous origin cells, but not in the population of destination cells, is a useful marker for the detection of cancer metastasized from the origin tissue to the destination tissue. The method may comprise the additional step of isolating the molecular marker. The method may also comprise the additional step of identifying the transcription factor that binds to regulatory regions of a gene associated with terminal differentiation of the origin tissue.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1. Functional characterization of deletion mutants of the human GC-C gene promoter. Deletion mutants of the GC-C gene 5′-flanking region were linked to luciferase and co-transfected with the Renilla luciferase control plasmid pRL-TK into intestinal (T84, Caco2) and extra-intestinal (HepG2, HeLa, HS766T) cell lines. Data are expressed as luciferase activity relative to the pGL3 Basic promoterless construct (Relative Activity). Each bar represents the mean ± the standard error of at least 3 independent transfections performed in duplicate.

[0012] FIG. 2. DNAse I protection of the proximal human GC-C promoter. Footprinting reactions included the indicated mg quantities (NE) of HepG2 or T84 nuclear extract and the −46 to −257 promoter fragment labeled at the 5′-end of the coding strand. A control digestion contained 60 mg of bovine serum albumin (BSA). Protected bases were identified by a Maxam-Gilbert sequencing reaction (G+A) of the labeled fragment. The sequence of FP1 is given. Arrowhead indicates DNAse I hypersensitivity site at base −163.

[0013] FIG. 3. Regulation of reporter gene expression by intestine-specific protected elements. FP1 and FP3 were deleted from the −835 luciferase construct by in vitro mutagenesis, and wild-type and deletion constructs were expressed in HepG2 and T84 cells. Results are expressed as luciferase activity relative to a promoterless construct and represent the mean ± the standard error of 3 independent transfections performed in duplicate.

[0014] FIG. 4. Intestinal specificity of FP1 probe EMSA. Nuclear extracts from intestinal or extra-intestinal cells, or BSA (10 mg), were incubated with labeled FP1 for 30 min. at room temp prior to separation on a non-denaturing 6% polyacrylamide gel.

[0015] FIG. 5. Cdx2 binding element FP1 is required for GC-C reporter gene activation. Putative binding sites for Cdx2 and HNF-4a are indicated on the −835 construct. T84 and HepG2 cells were transfected with the −835 reporter construct from which FP1 was deleted, or that construct containing the ‘CCC’ mutation. Results are expressed as (luciferase activity of mutant construct / luciferase activity of wildtype construct) × 100, and represent the mean ± the standard error of 3 independent transfections performed in duplicate. The values expressed as relative luciferase activities are, respectively, (wildtype; FP1 deletion; ‘CCC’ mutation): T84 (16.2±2.7; 1.9±0.3; 2.3±0.1) and HepG2 (2.1±0.1; 2.9±0.3; 2.2±0.1).
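The percentage expression used in the FIG. 5 legend can be illustrated with the mean relative luciferase activities quoted there. The following is a minimal sketch only: it assumes the legend's expression is the ratio of mutant to wildtype activity multiplied by 100, it uses the reported means without propagating the standard errors, and the function name is hypothetical.

```python
# Mean relative luciferase activities quoted in the FIG. 5 legend.
# Standard errors are omitted; this sketch works on the means only.
RELATIVE_ACTIVITY = {
    "T84": {"wildtype": 16.2, "FP1 deletion": 1.9, "CCC mutation": 2.3},
    "HepG2": {"wildtype": 2.1, "FP1 deletion": 2.9, "CCC mutation": 2.2},
}


def percent_of_wildtype(mutant, wildtype):
    """Assumed reading of the legend: (mutant / wildtype) x 100."""
    if wildtype <= 0:
        raise ValueError("wildtype activity must be positive")
    return mutant / wildtype * 100.0


for cell_line, values in RELATIVE_ACTIVITY.items():
    for construct in ("FP1 deletion", "CCC mutation"):
        pct = percent_of_wildtype(values[construct], values["wildtype"])
        print(f"{cell_line} {construct}: {pct:.1f}% of wildtype")
```

On these means, both FP1 alterations leave T84 activity well below the wildtype level, while HepG2 activity stays at or above it, which is the pattern the legend describes.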

DESCRIPTION OF PREFERRED EMBODIMENTS

[0016] The present invention relates to methods to identify and characterize molecular markers useful for detecting metastasized tumor cells. Most commonly, molecular markers used to detect tumor cells are transcripts or proteins specifically expressed as a result of the hyperproliferative state of the cell. In contrast, the molecular markers that are identified and characterized by the method of the present invention are specifically expressed in terminally differentiated tissues and are not specific to tumor cells. Tumor cells continue to express the genes associated with terminal differentiation of their tissue of origin. The transcripts and proteins of these genes are ideally suited to detect tumor cells that have metastasized to a destination tissue, such as a lymph node, because the origin tissue specific markers will be out of place in the destination tissue. Because these molecular markers are specific to the origin tissue and not a particular tumor, they will broadly recognize many tumors metastasized from the origin tissue.

[0017] The method for identifying molecular markers useful for detecting metastasized tumor cells identifies “candidate” tissue-specific molecular markers and determines which of these candidate markers are suitable for the detection of metastatic cancer. Tissue-specific markers associated with the terminal differentiation of a desired origin tissue are characterized by down-regulating the activity of a transcription factor associated with terminal differentiation of origin tissue, comparing the expression profiles of the down-regulated origin tissue with unaltered control origin tissue, and identifying transcripts or proteins that are candidate tissue-specific markers by virtue of their expression being up- or down-regulated in conjunction with the down-regulation of the transcription factor. The expression of the candidate tissue-specific markers is compared in the control origin tissue, tumors derived from the origin tissue, and destination tissues of interest for biopsy. Candidate markers that are expressed in control origin tissue and tumors, but not destination tissue, are useful markers for detecting metastatic tumor cells.
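The two-stage selection described above can be sketched as set filtering over expression profiles. This is an illustrative sketch only, not the claimed procedure: the gene names, expression values, and detection threshold are hypothetical, and the candidate step uses the narrower criterion stated in the Summary (expressed in control origin cells but not in down-regulated origin cells) rather than any up- or down-regulation.

```python
# Hypothetical sketch: gene names, expression values, and the detection
# threshold are illustrative assumptions, not data from this specification.

def is_expressed(profile, gene, threshold):
    """Treat a gene as expressed when its level meets the threshold."""
    return profile.get(gene, 0.0) >= threshold


def candidate_markers(control, down_regulated, threshold=1.0):
    """Genes expressed in control origin cells but not in down-regulated cells."""
    return {
        gene
        for gene in control
        if is_expressed(control, gene, threshold)
        and not is_expressed(down_regulated, gene, threshold)
    }


def useful_markers(candidates, control, cancerous, destination, threshold=1.0):
    """Candidates expressed in control and cancerous origin cells but not
    in the destination tissue."""
    return {
        gene
        for gene in candidates
        if is_expressed(control, gene, threshold)
        and is_expressed(cancerous, gene, threshold)
        and not is_expressed(destination, gene, threshold)
    }


# Made-up profiles mapping gene name to expression level.
control = {"gene_A": 8.0, "gene_B": 6.5, "gene_C": 7.2, "gene_D": 5.0}
down_regulated = {"gene_A": 0.2, "gene_B": 0.1, "gene_C": 7.0, "gene_D": 0.3}
cancerous = {"gene_A": 7.5, "gene_B": 0.0, "gene_C": 6.8, "gene_D": 4.1}
destination = {"gene_A": 0.0, "gene_B": 0.0, "gene_C": 0.1, "gene_D": 3.9}

candidates = candidate_markers(control, down_regulated)
markers = useful_markers(candidates, control, cancerous, destination)
print(sorted(candidates))  # ['gene_A', 'gene_B', 'gene_D']
print(sorted(markers))     # ['gene_A']
```

In this made-up data, gene_B is dropped because the tumor no longer expresses it, and gene_D is dropped because the destination tissue expresses it too. A single fixed threshold is a simplification; real profiling data would normally call for a statistical test of differential expression.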

[0018] As used herein, the term “terminal differentiation” refers to a differentiation state of a cell or tissue from which no further differentiation can occur.

[0019] As used herein, the term “metastasize” refers to the process whereby cancer cells break loose from a tumor mass and form secondary tumors or metastases at other sites of the body.

[0020] As used herein, the term “colorectal tract” refers to the tissues and organs that comprise the large intestine to and including the rectum. These tissues and organs comprise, but are not limited to, the terminal ileum and ileocecal valve, the cecum, the ascending, transverse, descending and sigmoid colon, the rectum and the anal sphincters.

[0021] The origin tissue of the invention is any terminally differentiated tissue of the body in which tumor cells first arise. By “arise”, it is meant to confer to cells the hyperproliferative phenotype associated with tumor cells. The origin tissue is preferably a tissue from which cancer cells are most likely to metastasize. In a preferred embodiment, the tissue is mammalian, and in a most preferred embodiment, the tissue is human. In preferred embodiments, the origin tissue includes, but is not limited to, colorectal, intestine, stomach, liver, mouth, esophagus, throat, thyroid, skin, brain, kidney, pancreas, breast, cervix, ovary, uterus, testicle, prostate, bone, muscle, bladder and lung. It is particularly advantageous to use established cell lines in the method of the invention. The cell lines of particular interest represent terminally differentiated cells of the origin tissue, including embryonic tissue cell lines and immortalized cell lines (Yeager and Reddel, 1999, Curr. Opin. Biotechnology 10:465-469). Cell lines of particular interest include, but are not limited to, T84, Caco2, HT29, SW480, SW620, NCI H508, SW1116, SW1463, Hep G2, HS766T, and HeLa cells. These and additional cell lines of origin tissue may be obtained from the American Type Culture Collection (Manassas, Va.), as well as from commercial sources.

[0022] Cancerous origin tissues are isolated from tumors that arise in the origin tissue. Cancerous cells may be obtained by removing tumors from patients. Established populations of tumor tissue, i.e. cell lines of tumor cells, can be used to advantage in the method of the invention. Cancer cell lines of interest include, but are not limited to, T84, Caco2, HT29, SW480, SW620, NCI H508, SW1116, SW1463, Hep G2, HS766T, and HeLa cells. These cell lines and other useful cell lines may be obtained from the American Type Culture Collection (Manassas, Va.), as well as from commercial sources.

[0023] The destination tissue of the invention is any tissue or bodily fluid that may be biopsied to detect metastasized tumor cells. Several tissues of the body are well known to those in the art for their propensity to accumulate metastasized tumor cells, and these tissues are preferred for the destination tissue. However, the destination tissue may be any tissue of the body. Destination tissues of particular interest include, but are not limited to, lymph node, blood, cerebrospinal fluid, and bone marrow. Additional cell lines for origin tissue cells may be obtained from the American Type Culture Collection (Manassas, Va.), as well as from commercial sources. Preferably, biopsy or resected tissue is used as the destination tissue.

[0024] The transcription factors used in the method of the invention are transcription factors that are associated with terminal differentiation of the origin tissue. Many such transcription factors are already known to those skilled in the art. In preferred embodiments, the transcription factor is associated with the terminal differentiation of a preferred origin tissue. In preferred embodiments, the transcription factors include, but are not limited to, Cdx2 (intestine) (Mallo, G. V. et al., 1997 Int J Cancer 74:35-44; Genbank Accession No. BF591065), STAT5 (breast) (Hou, J. et al., 1995 Immunity 2:321-329; Genbank Accession No. L41142), NKX3.1 (prostate) (Genbank Accession No. AF247704), GBX2 (prostate) (Lin, X. et al., 1996 Genomics 31: 35-342; Genbank Accession No. NM U13219), FREAC-2 (lung) (Pierrou, S. et al., 1994 EMBO J. 13:5002-5012; Genbank Accession No. U13220), Pit1 (thyroid) (Wu, W. et al., 1998 Nat Genet 18:147-9; Genbank Accession No. NM_006261), HNF4 (liver) (Chartier, F. L. et al., 1994 Gene 147:269-272; Kritis, A. A. et al., 1996 Gene 173:275-80; Genbank Accession Nos. X76930, X87870, X87872, X87871), LFB1 (liver) (Bach, I. et al., 1990 Genomics 8:155-164; Genbank Accession No. NM_000545), IPF1 (pancreas) (Stoffel, M. et al., 1995 Genomics 28:125-126; Genbank Accession Nos. NM_000209, U30329), Isl1 (pancreas) (Wang, M. and Drucker, D. J., 1994 Endocrinology 134:1416-1422; Genbank Accession Nos. XM_003669, NM_002202) and MyoD (muscle) (Pearson-White, S. H., 1991 Nucleic Acids Res. 19:1148; Genbank Accession No. X56677), all of which are incorporated by reference herein.

[0025] The method of the present invention may, in some embodiments, further comprise steps to identify a transcription factor gene associated with terminal differentiation. These additional steps comprise identifying the transcription factor that binds to the regulatory regions of a gene associated with terminal differentiation in the origin tissue. There are many protocols currently available and known to those skilled in the art to characterize transcription factors and transcription factor genes. In a preferred embodiment, electromobility shift assays and/or supershift assays are used to characterize the transcription factor that binds to the regulatory region of a gene whose expression is associated with terminal differentiation. Example 1 illustrates the characterization of transcription factor Cdx2 by its binding to the regulatory regions of the gene encoding the intestine-specific protein guanylyl cyclase C.

[0026] In the method of the invention, the activity of a transcription factor associated with terminal differentiation is “down-regulated” in a population of origin tissue cells. By “down-regulated”, it is meant that the activity of the transcription factor is reduced in the cell population as compared to a “normal” or control cell population. As used herein, a “cell population” refers to a cell culture, tissue culture, resected tissue or biopsy sample, or any group of cells from the desired tissue type. A population of normal or control origin cells is a population of origin cells from the culture of origin tissue cells used for down-regulating the transcription factor, but without modification of the activity of the transcription factor.

[0027] The activity of the transcription factor may be down-regulated in cell populations by several means well known to those in the art. In some embodiments, the transcription factor gene is down-regulated by site-directed mutagenesis of the coding or regulatory regions of the gene, or the transcription of an antisense gene constructed from the coding sequence of the transcription factor gene. Alternatively, in other embodiments, the activity of the transcription factor is blocked or inhibited by specific antibodies, DNA-binding molecules, or small molecules that interfere with the activity of the transcription factor by interfering with the assembly and/or initiation of the transcriptional complex. Inhibitor polynucleotide molecules of interest include, but are not limited to, FP1, FP1B and SIF1 (see Example 1). Finally, in other embodiments, the transcription factor may be down-regulated by activating a signaling event that inactivates the transcription factor, such as the addition of an extracellular ligand that initiates a cell-signaling event that phosphorylates and inactivates the transcription factor. These methods will be well known by those skilled in the art, and protocols can be found in many laboratory manuals, such as Ausubel et al. Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc., 2000. These embodiments are meant to illustrate methods by which to generate down-regulated origin cells. Other manners of down-regulation will be well known to those skilled in the art and are included in the scope of the method of the present invention.

[0028] In a preferred embodiment, the down-regulated origin cells are Cdx2-null polyps. Cdx2-null polyps can be resected from a mouse that is heterozygous for an inactive copy of the homeobox gene cdx2, which controls cell differentiation in the intestinal epithelium (Chawengsaksophak et al., 1997, Nature 386:84-87; Tamai et al., 1999, Cancer Res. 59:2965-2970; Beck et al., 1999, PNAS 96:7318-7323; incorporated by reference herein). Cdx2 stimulates the markers of enterocyte differentiation. These heterozygous mice develop multiple intestinal polyp-like lesions that do not express active Cdx2 and the Cdx2-related markers. In this embodiment, the comparison of the expression profiles of Cdx2-null polyps with surrounding intestinal tissue will identify the Cdx2-stimulated markers of enterocyte differentiation.

[0029] The method of the invention comprises the step of comparing the expression profile of the population of down-regulated origin cells with the expression profile of the population of control origin cells. By “expression profile” it is meant the array of nucleic acids or proteins that are expressed in a cell population. Most commonly, expression profiles are arrays of nucleic acid molecules, primarily mRNA molecules, that are found in the profiled cell population. Methods to compare RNA expression profiles are well known to those in the art. Some methods of particular interest include, but are not limited to, differential display (Welsh et al., 1992, Nucleic Acids Res. 20:4695-4970; Liang and Pardee, 1992, Science 257:967-970; Barnes, 1994, Proc. Natl. Acad. Sci. USA 91:2216-2220; Cheng et al., 1994, Proc. Natl. Acad. Sci. USA 91: 5695-5699; and the references cited therein), subtractive hybridization (Diatchenko et al., 1996, Proc. Natl. Acad. Sci. USA 93:6025-6030; Gurskaya et al., 1996, Anal. Biochem. 240:90-97; Endege et al., 1999, Biotechniques 26: 542-550; and the references cited therein), expression arrays (Schena et al., 1995, Science 270: 467-470; Shalon et al., 1996, Genome Res. 6: 639-645; Cheung et al., 1999, Nature Genetics 21 (Suppl.):15-19; and the references cited therein), Serial Analysis of Gene Expression (SAGE) (Velculescu et al., 1995, Science 270: 484-487; Zhang et al., 1997, Science 276: 1268-1272; Adams et al., 1996, Bioessays 18: 261-262; and the references cited therein), Rapid Analysis of Gene Expression (RAGE) (Wang et al., 1999, Nucleic Acids Res. 27: 4609-4618; and the references cited therein), Massively Parallel Signature Sequencing (MPSS) (Brenner et al., 2000, Nature Biotech. 18: 630-634; and references therein) and Tandem Arrayed Ligation of Expressed Sequence Tags (TALEST) (Spinella et al., 1999, Nucleic Acids Res. 27: e22 (I-VIII); and references therein).

[0030] Many of the aforementioned techniques may be performed using commercially available kits, reagents and apparatuses. Commercial kits for differential display may be purchased, such as the Delta® Differential Display Kit (Clontech, Palo Alto, Calif.), among others. Commercial kits for subtractive hybridization may be purchased, such as Clontech PCR-Select® Subtraction (Clontech, Palo Alto, Calif.), among others. Micro-arrays of popular cDNA populations may be purchased (Incyte Genomics, Inc., St. Louis, Mo.), or custom micro-arrays may be ordered from commercial sources (Radius Biosciences, Medfield, Mass.; ProtoGene Laboratories, Inc., Menlo Park, Calif.). A preferred membrane-format microarray is LifeGrid™ Sequence-Verified Gene Expression Array Kits (Incyte Pharmaceuticals, Inc., St. Louis, Mo.) and a preferred slide-format microarray is GEM® Gene Expression Microarray (Incyte Pharmaceuticals, Inc., St. Louis, Mo.). Commercial kits for RAGE are available from Kirkegaard & Perry Laboratories, Inc. (Gaithersburg, Md.). GeneTag®, a proprietary technology developed by Celera Genomics (Rockville, Md.), may also be used to quantify gene expression in a profile of RNA transcripts.

[0031] Protein expression profiles may also be compared by methods that will be well known to those in the art. Methods of particular interest include, but are not limited to, 2-Dimensional Electrophoresis - Mass Spectroscopy (2DE-MS) (O'Farrell, 1975, J. Biol. Chem. 250: 4007-4021; Patterson and Aebersold, 1995, Electrophoresis 16: 1791-1814; Gygi et al., 2000, Curr. Opinion in Biotech. 11: 396-401; and references cited therein) and Isotope-Coded Affinity Tags (ICAT) (Gygi et al., 1999, Nature Biotech. 17: 994-999; Gygi et al., 2000, Curr. Opinion in Biotech. 11: 396-401; and references cited therein).

[0032] Nucleic acid molecules or protein molecules of interest identified by the comparison of expression profiles may additionally be isolated using methods that will be well known to those skilled in the art. The isolation method chosen depends in many cases on the method used to compare the expression profiles, and the preferred method will often be described in the reference that describes the method of comparison (see aforementioned citations). For example, nucleic acid bands may be removed from a polyacrylamide gel, agarose gel or nitrocellulose, the nucleic acids eluted and cloned using techniques well known in the art (Ausubel et al. Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc., 2000).

[0033] The method of the invention comprises the step of comparing the expression of the candidate markers in several kinds of cells. There are many methods to compare the expression of single genes which will be well known to those in the art (Ausubel et al. Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc., 2000), including but not limited to, northern analysis, Southern analysis with cDNA, RNase protection assays, quantitative PCR, competitive PCR, 5′ nuclease assays (Lie and Petropoulos, 1998, Curr. Opin. Biotech. 9:43-48 and the references cited therein), western analysis, dot blot western, ELISA and other immunoassays, and immunohistochemistry.

[0034] The molecular markers identified by the method of the invention may be used to diagnose and stage cancer in mammalian patients, including following the development of recurrence of cancer after surgery and screening normal patients for the development of cancer. In the case of cancer patients, the molecular markers utilized would ideally be identified from the same tissue in which the patient's cancer arose. In the case of patients without a history of cancer, a selection of molecular markers isolated from different origin tissues is preferred. The metastases may be diagnosed by any technique that will detect the nucleic acid or protein molecular marker. The sensitivity of the technique will determine in part the size of metastasis that can be detected. Preferred techniques utilize PCR, ELISA, and the like. Example 2 illustrates a particularly preferred method to diagnose metastasized cancer with the molecular markers of the method.

[0035] Tissue-specific molecular markers can also be utilized to localize therapeutics to specific tissue and organ systems. This use is particularly appropriate for tissue-specific molecular markers that are localized on the surface of the tissue cells. These therapeutics include, but are not limited to, chemotherapeutics, analgesics, antibiotics, anti-inflammatories, hormones and stimulants.

[0036] Protein molecular markers may be used to generate antibodies that may be used in diagnostic methods and to localize therapeutics. Polyclonal antibodies and monoclonal antibodies, and fragments thereof, and various conjugates of them can be made by methods well known in the art.

Example 1 Cdx2 is a Transcription Factor Associated with the Intestinal-specific Expression of Guanylyl Cyclase C

[0037] This example illustrates the identification of a transcriptional activating factor required for intestine-specific expression of guanylyl cyclase C (GC-C). A region of the proximal GC-C promoter required for specific expression in intestinal cells contains a protected region, FP1, with a consensus binding sequence for Cdx2. FP1 formed a complex specifically with nuclear proteins only from intestinal cells, and this complex was recognized by anti-Cdx2 antibody. Elimination or mutation of the Cdx2 consensus binding sequence within FP1 reduced reporter gene activity in intestinal cells to that obtained in extra-intestinal cells. These data suggest that Cdx2 activates tissue-specific transcription of GC-C.

Materials and Methods

[0038] Genomic Library Screening and Sequencing. The GC-C gene 5′ regulatory region was cloned from a λFIXII human genomic library (Stratagene, La Jolla, Calif.). The library was screened by hybridization with a probe specific for exon 1 of the guanylyl cyclase C (GC-C) cDNA. A 2.8 kb XbaI fragment that included 2 kb upstream of the start site of transcription was subcloned into Bluescript KS (Stratagene). All constructs were generated from this Bluescript/human GC-C gene construct. The nucleic acid sequence of each construct was confirmed by BigDye terminator® reaction chemistry for sequence analysis on the Applied Biosystems Model 377 DNA sequencing systems (Perkin-Elmer, Norwalk, Conn.; Applied Biosystems, Foster City, Calif.).

[0039] Reporter Gene Constructs. Fragments −835 to +117, −257 to +117, −129 to +117, and −46 to +117, relative to the start site of transcription, were isolated from Bluescript KS constructs by digestion with selected restriction endonucleases (Mann et al., 1996, Biochim Biophys Acta 1305:7-10). These fragments were blunt-ended and ligated into the EcoRV site of Bluescript KS. Inserts were excised from Bluescript KS with SmaI and KpnI and ligated into the pGL3-Basic Luciferase Vector (Promega, Madison, Wis.). The pGL3 Control Vector, containing an SV40 promoter with enhancers, was used as a positive control.

[0040] Mutations were created in the −835 to +117 pGL3 construct utilizing the PCR-based Ex-site Mutation Kit (Stratagene). Deletion constructs were created using primers flanking the sites of interest. The FP1 “CCC” mutant was created using the phosphorylated primers:

[0041] 5′ GCCCATAGCTCTGACCTTTCTG 3′ (SEQ ID NO:1) and

[0042] 5′ AGAGAGATTAGCTGGGCCTCACCC 3′ (SEQ ID NO:2).

[0043] Cell Culture and Transfection. All cell lines were obtained from American Type Culture Collection (Rockville, Md.). T84 cells were grown in DMEM/F12 (Life Technologies, Rockville, Md.), Caco2 cells in DMEM (Life Technologies), HepG2 and HS766T cells in DMEM High Glucose (Cellgro®, Mediatech, Inc., Herndon, Va.), and HeLa cells in MEM with glutamine (Life Technologies). All cell lines were maintained at 37° C. in a 5% CO₂/95% air atmosphere and passaged every four days. Assays of reporter gene activity were conducted with cells plated in 6-well plates seeded at either 5.0×10⁵ (T84, Caco2, and HeLa) or 1.0×10⁶ cells per well (HepG2 and HS766T). Cells were incubated overnight, washed one time with PBS, and supplemented with fresh media before transfection.

[0044] Plasmids purified with the Qiafilter Kit (Qiagen, Valencia, Calif.) were transfected into cells with the non-liposomal lipid transfection reagent Effectene® (Qiagen). All cell lines were co-transfected with both 0.4 mg of firefly luciferase experimental reporter constructs, modified from pGL3-Basic, and 0.1 mg of the Renilla luciferase control reporter, pRL-TK, driven by a viral thymidine kinase promoter (Promega). Cells were incubated with transfection complexes for 24 h, rinsed with PBS, then supplemented with appropriate media and incubated for a further 24 h. After a total of 48 h, cells were lysed and assayed using the protocol and materials in the Dual-Luciferase Reporter Assay system (Promega). Luminescence was measured with a BioOrbit 1251 Luminometer (Pharmacia LKB, Uppsala, Sweden). Luciferase expression from pGL3 constructs was normalized to pRL-TK expression.
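The normalization described above, together with the relative activity and mean ± standard error reported in the figure legends, can be sketched as follows. This is a hedged illustration rather than the authors' actual calculation: all luminometer readings are made up, the function names are hypothetical, and dividing each construct's ratio by the pGL3-Basic ratio from the same transfection is an assumption about how the values were paired.

```python
from math import sqrt
from statistics import mean, stdev


def normalized(firefly, renilla):
    """Firefly luminescence normalized to the Renilla (pRL-TK) control."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def relative_activity(construct_ratio, basic_ratio):
    """Normalized activity relative to the promoterless pGL3-Basic construct."""
    if basic_ratio <= 0:
        raise ValueError("pGL3-Basic ratio must be positive")
    return construct_ratio / basic_ratio


def mean_sem(values):
    """Mean and standard error of the mean over independent transfections."""
    if len(values) < 2:
        raise ValueError("need at least two values for a standard error")
    return mean(values), stdev(values) / sqrt(len(values))


# Made-up (firefly, Renilla) readings from three independent transfections.
construct_reads = [(1200.0, 100.0), (1500.0, 120.0), (1100.0, 100.0)]
basic_reads = [(90.0, 100.0), (120.0, 120.0), (110.0, 100.0)]

per_transfection = [
    relative_activity(normalized(*c), normalized(*b))
    for c, b in zip(construct_reads, basic_reads)
]
avg, sem = mean_sem(per_transfection)
print(f"relative activity: {avg:.2f} +/- {sem:.2f}")
```

Dividing by the co-transfected Renilla signal corrects for well-to-well differences in transfection efficiency and cell number before constructs are compared.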

[0045] Nuclear Protein Extraction. Nuclear extracts were prepared essentially as previously described (Ausubel et al. Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc., 2000). Nuclear protein concentration was determined using Coomassie Protein Assay Reagent (Pierce, Rockford, Ill.).

[0046] DNAse I Footprinting. A fragment of the GC-C gene regulatory region −46 to −257 relative to the start of transcription was obtained by digestion with DraIII and AflII, blunt-ended, and subcloned into the Bluescript® KS EcoRV site, as described above, and then digested with EcoRI and HindIII to ensure that the coding strand of the probe was singly end-labeled with [α-³²P]dCTP. Products obtained from footprinting reactions were separated on a denaturing 6% polyacrylamide gel and visualized by a Phosphorimager SI (Molecular Dynamics, Sunnyvale, Calif.).

[0047] Electromobility Shift Assay (EMSA). Protein-DNA binding reactions performed in the same buffer as the DNase I protection assay (4% glycerol, 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, 2.5 mM MgCl₂ and 5 mM DTT) included 1 mg of Poly(dI.dC)-Poly(dI.dC) (Amersham Pharmacia Biotech, Piscataway, N.J.) and 30 kcpm of probe. Reactions were initiated by the addition of nuclear extract and incubated for 30 min at room temp to produce protein complexes, which were separated on a 6% non-denaturing polyacrylamide (37.5:1) gel in 0.5× TBE running buffer. Gels were dried prior to visualization of radiolabelled complexes by autoradiography. In competition assays, unlabelled competitor was added to the reaction mixtures at concentrations ranging from 25-fold to 250-fold molar excess of the labeled probe prior to the addition of the nuclear extract. Supershift assays were performed by adding 2 ml of murine Cdx2 antibody after an initial incubation period of 30 min; incubation was then continued for an additional 30 min. Transcribed and translated murine Cdx2 protein was generated in vitro using linearized pRc/CMV-Cdx2 expression vector as a template for the TNT-Quickcoupled Kit (Promega).

[0048] Oligonucleotide probes for EMSA were synthesized. Complementary oligonucleotides in 10 mM Tris-HCl (pH 7.5), 1 mM EDTA were annealed in a Hybaid Thermal Cycler by a programmed ramp in temperature from 95° C. to 25° C. over the course of 1 h. The single-stranded sequences of the probes were:

[0049] FP1: 5′ CAGCTAATCTCTCTGTTTATAGCTCTGACCTTTC 3′ (SEQ ID NO:3)

[0050] FP1B: 5′ ATCTCTCTGTTTATAGCTCTGACCTTTCTGGGTGC 3′ (SEQ ID NO:4)

[0051] FP1-CCC: 5′ CAGCTAATCTCTCTGCCCATAGCTCTGACCTTTC 3′ (SEQ ID NO:5)

[0052] SIF1: 5′ GATCCGGCTGGTGAGGGTGCAATAAAACTTTATGAGTA 3′ (SEQ ID NO:6)

[0053] Bolded sequences indicate specific Cdx2 binding sites. A mutation created in the FP1 protected site is underlined. Five pmol of annealed oligonucleotide probe were end-labeled employing 1 unit of T4 polynucleotide kinase and 2 ml of 7,000 Ci/mmol [γ-³²P]ATP (Ausubel et al. Current Protocols in Molecular Biology. New York: John Wiley & Sons, Inc., 1999). Labeled probes were purified over Qiaquick nucleotide purification columns (Qiagen).
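The relationship between the probes listed above can be checked computationally. The following is a minimal Python sketch offered only as an illustration; it is not part of the described methods, and the helper names `reverse_complement` and `mismatch_positions` are hypothetical. It derives the strand complementary to FP1 (the strand that would be annealed to it) and locates the substitution that distinguishes FP1-CCC from FP1.

```python
# Probe sequences copied from SEQ ID NO:3 (FP1) and SEQ ID NO:5 (FP1-CCC).
FP1 = "CAGCTAATCTCTCTGTTTATAGCTCTGACCTTTC"
FP1_CCC = "CAGCTAATCTCTCTGCCCATAGCTCTGACCTTTC"

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(a, b):
    """Return 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print("FP1 complementary strand:", reverse_complement(FP1))
    print("FP1 vs FP1-CCC mismatches:", mismatch_positions(FP1, FP1_CCC))
```

Under these assumptions the two probes differ only at positions 16 to 18, where TTT in FP1 is replaced by CCC.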

[0054] Southwestern and Western Blotting. Nuclear extracts were denatured in reducing SDS sample buffer, separated on an 8% Tris-glycine-SDS polyacrylamide gel, and transferred to nitrocellulose. For Southwestern analysis, the blotted proteins were blocked for 1 h at 4° C. in Z′ buffer (25 mM Hepes-KOH (pH 7.6), 12.5 mM MgCl₂, 20% glycerol, 0.1% Nonidet P-40, 100 mM KCl, 10 mM ZnSO₄, 1 mM DTT) containing 3% non-fat dry milk (Hames and Higgins. Gene Transcription: A Practical Approach. The Practical Approach Series. New York: Oxford University Press, 1993.). The membrane was rinsed for 5 min in EMSA binding buffer and hybridized with 20 ml of EMSA binding buffer with 100 kcpm/ml of labeled FP1 probe for 1 h at room temperature. The membrane was then washed for 5 min each in three changes of EMSA binding buffer, dried and visualized by autoradiography.

[0055] Western blots were blocked in TBS/0.1% Tween-20 with 5% non-fat dry milk, and probed with Cdx2 antibody diluted 1:5000. Binding of primary antibody was visualized using goat anti-rabbit alkaline phosphatase-conjugated secondary antibody diluted 1:10,000 (Sigma). Alkaline phosphatase substrates BCIP and NBT were used in an AP Color Kit (Biorad).

Results

[0056] Determination of elements controlling intestine-specific expression in the 5′ regulatory region of the GC-C gene. Minimal luciferase activity was obtained when various cell lines were transfected with the −46 construct (FIG. 1). In contrast, luciferase activity increased in intestinal cells transfected with each of the other reporter gene constructs (FIG. 1). Luciferase activity did not increase when extra-intestinal cells were transfected with these constructs (FIG. 1). These results are consistent with previous studies of GC-C gene regulation, and suggest that there are one or more tissue-specific regulatory elements within the +118 to −257 region. Since transfection with the −46 to −129 construct resulted in a significant increase in activity of the reporter gene in intestinal cells only (FIG. 1), and since this region is highly conserved evolutionarily, it was chosen for detailed structure-function analysis.

[0057] DNase I protection by intestine-specific nuclear protein binding to the 5′ regulatory region of GC-C. A DNase I protection assay revealed two regions (−75 to −83, FP1; −164 to −178, FP3) which were protected only by nuclear extracts from intestinal cells (T84; FIG. 2). Regions −104 to −137 (FP2) and −180 to −217 (FP4) were protected by nuclear extracts from either intestinal (T84) or extra-intestinal (HepG2) cells, although the proximal and distal ends of FP2 exhibited different patterns of protection. These data suggest that the protected regions designated FP1 and FP3 were specific binding sites for nuclear proteins from intestinal cells. In addition, an intestine-specific site of open chromatin structure in the proximal 5′-flanking region of the GC-C gene was identified by a DNase I hypersensitive site at base −163 (FIG. 2).

[0058] Transcriptional activity of the −857 construct following deletion of FP1 or FP3. Transfection of T84 cells revealed that deletion of FP3 increased luciferase activity 2.5-fold relative to the wild-type construct (FIG. 3). In contrast, elimination of FP1 reduced luciferase activity in T84 cells to levels observed in HepG2 cells (FIG. 3). These data suggest that FP3 contains a negative regulatory element, and that FP1 contains an intestine-specific positive regulatory element. Analysis by TRANSFAC (Heinemeyer et al., 1998, Nucleic Acids Res 26:364-370), a database of transcription factor binding sites, revealed that FP1 contains the consensus binding site for the homeodomain protein Cdx2 (Quandt et al., 1995, Nucleic Acids Res 23:4878-84). Since Cdx2 is a transcription factor that directs intestine-specific expression of several genes, FP1 was more closely examined (Traber and Silberg, 1996, Annu Rev Physiol 58:275-97).

[0059] Specific complexes are formed by intestinal nuclear extract and FP1 probe. The ability of the protected site FP1 to form intestine-specific complexes was determined by incubating an oligonucleotide probe with nuclear extracts prepared from T84, Caco2, HepG2, or HeLa cells. Indeed, several complexes were obtained by EMSA when the FP1 probe was incubated with nuclear extracts from those cells (FIG. 4). However, only one complex satisfied criteria for intestinal specificity, including formation by nuclear extracts from T84 and Caco2 cells, but not from HepG2 or HeLa cells. Extracts from T84 and Caco2 cells, but not from HepG2 or HeLa cells, also formed complexes with SIF1 that were identical to those obtained previously with that probe, demonstrating the integrity of the extracts (Suh et al., 1994, Mol Cell Biol 14:7340-51). All of the EMSA complexes formed with T84 nuclear extracts were competed with increasing amounts of unlabelled FP1 probe in a concentration-dependent manner. In contrast, an unlabelled competitor in which the Cdx2 binding site was specifically mutated (FP1-CCC probe, see Materials and Methods) did not compete against the intestine-specific complex. SIF1, an oligonucleotide containing two consensus binding sites for Cdx2, selectively prevented the formation of the FP1-dependent intestine-specific complex with greater potency than unlabelled FP1, but generally did not affect the binding of the remaining T84-EMSA complexes (Suh et al., 1994). These data suggest that the intestine-specific factor that binds to the FP1 protected site is most likely Cdx2.

[0060] Cdx2 binds specifically to the FP1 probe. To determine whether FP1 is a binding site for Cdx2, labeled FP1 was incubated with in vitro transcribed and translated murine Cdx2. This resulted in a complex whose mobility was identical to that of the intestine-specific complex formed by T84 nuclear extract. In contrast, labeled FP1-CCC did not form the intestine-specific complex with either Cdx2 or T84 nuclear extract. An antibody against Cdx2 decreased the mobility of the specific complex formed between labeled FP1 and either T84 nuclear extract or in vitro transcribed and translated Cdx2. In contrast, an antibody against a related homeodomain transcription factor, Cdx1, did not alter the mobility of the intestine-specific complex. These data lead to the conclusion that the FP1 protected site is a binding site for Cdx2.

[0061] Identification of the intestine-specific nuclear factor by Southwestern and Western blots. Whether the FP1 probe and anti-Cdx2 antibody bound to the same intestine-specific protein was examined. Labeled FP1B, which is highly homologous to FP1 probe, specifically bound to an intestine-specific protein of ~40 kDa in T84 and Caco2, but not HepG2, nuclear extracts. In addition, FP1B probe bound to a ~131 kDa protein present in all cell lines examined. Similarly, anti-Cdx2 antibody recognized a protein doublet of ~40 kDa expressed in T84, but not in HepG2 or HeLa, cell nuclear extracts, a pattern which is characteristic of Cdx2 (James et al., 1994, J Biol Chem 269:15229-37). Thus, the FP1 protected region binds to an intestine-specific factor of the same molecular weight and antigenic recognition as Cdx2. Furthermore, Southwestern blots revealed that FP1 probe binds directly to Cdx2.

[0062] Role of the Cdx2 binding element (FP1) in intestine-specific gene expression of the GC-C promoter. The ‘CCC’ mutation was introduced into the FP1 element of the −835 luciferase reporter gene construct. This mutated reporter gene construct exhibited reduced activity in T84 cells that was comparable to that of the construct from which the entire FP1 region was deleted (FIG. 5). Neither the FP1 deletion nor the ‘CCC’ mutation in FP1 altered luciferase expression in HepG2 cells (FIG. 5). These data demonstrate that an intact Cdx2 binding site is required for activity of the GC-C promoter. Indeed, disruption of the Cdx2 binding site resulted in minimal activity.

Example 2 Guanylyl Cyclase C Messenger RNA is used as a Molecular Marker to Detect Recurrent Stage II Colorectal Cancer

[0063] This example illustrates the use of a tissue-specific molecular marker to diagnose metastases. Detection of GCC mRNA by RT-PCR enhances the accuracy of colorectal cancer staging. The expression in lymph nodes of GCC mRNA, a molecular marker for colorectal cancer cells in extraintestinal tissues, is associated with disease recurrence in patients with histologically negative nodes (stage II). Expression of GCC mRNA reflects the presence of colorectal cancer micrometastases below the limit of detection by standard histopathology. GCC-specific RT-PCR can reliably and reproducibly detect a single human colorectal cancer cell (T84 cells, ATCC, Rockville, Md.) in 10⁷ nucleated blood cells (Carrithers et al., 1996, Proc Natl Acad Sci USA 93:14827-32).

[0064] GCC, a member of the guanylyl cyclase family of receptors, is specifically expressed only in intestinal mucosal cells. However, GCC expression persists in intestinal cells that undergo neoplastic transformation to colorectal cancer cells. Examination of >300 surgical specimens demonstrated that GCC was specifically expressed by all primary and metastatic colorectal cancer cells, but not by any other extraintestinal tissues or tumors. GCC is identified only in lymph nodes from stage II patients who suffered recurrence ≦3 y, but not in lymph nodes from patients without recurrent disease ≧6 y, following diagnosis.

Materials and Methods

[0065] Patients and tissues. The Thomas Jefferson University Hospital tumor registry database was examined for patients who had undergone treatment for colorectal cancer between 1989 and 1995, an interval permitting adequate follow-up of patients for this study. This initial search was designed to exclude patients with recurrent disease >3 y following index surgery to avoid inadvertent inclusion of patients with metachronous, rather than recurrent, cancer. This search yielded 445 patients with invasive colon or rectal carcinoma with no evidence of metastases (N₀M₀) at the time of surgery. Of these, 260 patients underwent surgery at Thomas Jefferson University that yielded lymph nodes. Subsequently, 167 patients were excluded because they had TNM stage I disease or less (T₀, T₁ or T₂N₀M₀), developed recurrent disease locally or at unspecified sites, or received neoadjuvant chemo- or radiotherapy. Fifty-six patients with no evidence of recurrence were then excluded because they had <6 y of follow-up. After these exclusions, a total of 18 patients with no evidence of disease for ≧6 y following surgery and considered clinically cured remained. These patients formed the control group. Similarly, all 19 patients who developed metastases ≦3 y following surgery were included in the case group. Sixteen patients in the control group and 12 patients in the case group had pathology specimens available for further analysis. Two patients in the control group (patients 9 and 16; 12.5%) and 1 patient in the case group (patient 24; 8.3%) received 5-fluorouracil-based adjuvant chemotherapy following surgery.
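The patient selection described above can be followed as a simple tally. The short Python sketch below is illustrative only: the variable names are hypothetical and the counts are those stated in the paragraph. It confirms that the stated exclusions leave the 37 patients who form the control and case groups, and recomputes the adjuvant chemotherapy percentages.

```python
# Counts stated in the patient selection paragraph.
with_lymph_nodes = 260            # N0M0 patients whose surgery yielded lymph nodes
excluded_stage_or_therapy = 167   # stage I or less, local/unspecified recurrence, neoadjuvant therapy
excluded_short_follow_up = 56     # no recurrence but <6 y of follow-up

remaining = with_lymph_nodes - excluded_stage_or_therapy - excluded_short_follow_up

control_group = 18   # disease-free for >=6 y
case_group = 19      # metastases within 3 y

# Patients with pathology specimens available for analysis.
control_with_specimens = 16
case_with_specimens = 12
analyzed = control_with_specimens + case_with_specimens


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("remaining after exclusions:", remaining)
    print("control + case:", control_group + case_group)
    print("patients analyzed:", analyzed)
    print("adjuvant chemotherapy, control:", percent(2, control_with_specimens), "%")
    print("adjuvant chemotherapy, case:", percent(1, case_with_specimens), "%")
```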

[0066] Reverse transcriptase-polymerase chain reaction. Preliminary studies demonstrated that mRNA isolated from 10 μm sections from individual lymph nodes yielded insufficient RNA for RT-PCR analyses. Consequently, at least five 10 μm sections of representative lymph nodes for each patient were pooled and deparaffinized, and the total RNA isolated (Waldman et al., 1996, Dis Colon Rectum 41:1-6). RT-PCR was performed employing RNA PCR kit ver.2 (Takara Shuzo Co., Ltd., Kyoto, Japan; Carrithers et al., 1996, Proc Natl Acad Sci USA 93:14827-32; Waldman et al., 1996, Dis Colon Rectum 41:1-6). Only total RNA that yielded amplicons following β-actin-specific RT-PCR was employed in studies outlined below. GCC-specific and nested carcinoembryonic antigen-specific RT-PCR was performed as described previously (Carrithers et al., 1996, Proc Natl Acad Sci USA 93:14827-32; Waldman et al., 1996, Dis Colon Rectum 41:1-6; Liefers et al., 1998, New Engl J Med 339:223-8). RT-PCR reactions were separated by electrophoresis on 4% NuSieve 3:1 agarose® (FMC Bioproducts, Rockland, Me.) and amplification products visualized by ethidium bromide. Positive controls, consisting of RNA isolated from human colorectal cancer cells expressing GCC and carcinoembryonic antigen (Caco2 cells; American Type Culture Collection, Rockville, Md.) and negative controls, consisting of incubations in which no template was added and RNA from lymph nodes devoid of colorectal cancer, were included. Amplicon identity was confirmed by sequencing. Production of GCC-specific amplicons was confirmed by Southern analysis, employing a ³²P-labeled antisense probe complementary to a sequence internal to primers used for amplification (Kroczek, 1993, J Chromatog 618:133-145).

[0067] Statistical analysis. Results are expressed as the mean ± SD except disease-free and overall survival, which are expressed as the median (range). P values were calculated using Fisher's Exact test. The odds ratio with exact 95% confidence interval (CI) was calculated employing the StatXact 4.0 statistical software package (CYTEL Software Corp., Cambridge, Mass.).
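For readers unfamiliar with the test named above, the following Python sketch shows how a two-sided Fisher's exact P value and a simple sample odds ratio can be computed for a 2×2 table. It is an assumption-laden illustration, not the StatXact procedure used in the study: the function names are hypothetical, the example counts are invented for demonstration, and the exact confidence interval reported by StatXact is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * tolerance
    )
    return min(1.0, p_value)


def sample_odds_ratio(a, b, c, d):
    """Unconditional (cross-product) odds ratio; infinite if b*c is zero."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    table = (3, 1, 1, 3)
    print("P =", round(fisher_exact_two_sided(*table), 4))
    print("odds ratio =", sample_odds_ratio(*table))
```

Dedicated packages typically report a conditional maximum-likelihood odds ratio with an exact interval, which can differ from the cross-product value computed in this sketch.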

Results

[0068] Characteristics of patients evaluated by RT-PCR. The age of patients ranged from 37 to 85 y (68.1±9.5 y). The ages of females (range=52-85 y; 64.5±10.5 y) and males (range=37-82 y; 70.9±7.8 y) were similar. The ratio of males to females was balanced between control (8:9) and case (5:7) groups. One female patient was African-American; all other patients were Caucasian. The ratio of T₃ to T₄ disease was 3:13 in the control group and 4:8 in the case group. Patients were followed for 9 to 105 months (67.4±30.7 months). Patients in the control group were followed for 73 to 105 months (89.9±7.8 months) while those in the case group were followed for 9 to 78 months (37.3±22.6 months). In the control group, one patient (6.3%) developed a new primary colonic lesion 96 months after initial diagnosis, one (6.3%) died of causes unrelated to colorectal cancer, and the remaining 14 (87.5%) were alive and free of disease 88 (range, 73-97) months following diagnosis. In the case group, 8 (66.6%) patients died of recurrent colorectal cancer following intervals of disease-free and overall survival of 13 (range, 3-35) and 19 (range, 9-64) months, respectively. Four (33%) were alive with metastases following intervals of disease-free and overall survival of 12 (range, 2-36) and 52 (range, 17-78) months, respectively.

[0069] RT-PCR analysis of RNA expression in lymph nodes. For the 28 patients in the control and case groups, a total of 524 lymph nodes (18.4±12.5 lymph nodes/patient) collected at surgery were reported free of tumor by histologic review. The number of lymph nodes obtained from each patient at the time of initial operative staging was similar between control (19.9±13.2) and case (17.2±12.7) groups. Twenty-one patients (75%) yielded 159 paraffin-embedded lymph nodes (7.6±5.2 lymph nodes/patient) that could be adequately evaluated by RT-PCR. Lymph nodes omitted from RT-PCR analysis were not available from pathology (326 lymph nodes from 28 patients; 62.2% of 524 lymph nodes obtained at surgery) or did not yield RNA (39 lymph nodes from 7 patients; 7.4% of 524 lymph nodes obtained at surgery; 19.7% of 198 lymph nodes available for RT-PCR analysis). The number of lymph nodes available for RT-PCR analysis was balanced between control (6.4±3.0) and case (8.1±6.3) groups.
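The lymph node accounting in the preceding paragraph can be verified with a few lines of arithmetic. The Python sketch below is illustrative only; the variable names are hypothetical and the counts are those stated in the text.

```python
# Lymph node counts stated in the paragraph above.
collected_at_surgery = 524
not_available_from_pathology = 326
did_not_yield_rna = 39

# Nodes that reached the RT-PCR workflow, and nodes that gave usable RNA.
available_for_rt_pcr = collected_at_surgery - not_available_from_pathology
evaluated_by_rt_pcr = available_for_rt_pcr - did_not_yield_rna


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("available for RT-PCR:", available_for_rt_pcr)
    print("evaluated by RT-PCR:", evaluated_by_rt_pcr)
    print("unavailable, % of collected:",
          percent(not_available_from_pathology, collected_at_surgery))
    print("no RNA, % of collected:",
          percent(did_not_yield_rna, collected_at_surgery))
    print("no RNA, % of available:",
          percent(did_not_yield_rna, available_for_rt_pcr))
```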

[0070] β-Actin-specific amplicons (an indicator of intact RNA) were not detected in total RNA from pooled sections of lymph nodes of 5 (41.7%) patients from the case group and 2 (16.7%) patients from the control group, and these patients were excluded from further analysis. Total RNA extracted from pooled lymph node sections from the remaining 21 patients was analyzed by RT-PCR using GCC-specific primers. GCC-specific amplicons were not detected in any reaction using RNA from lymph nodes of patients in the control group (p=0.004; Table 1). The absence of GCC-specific amplicons in these reactions was confirmed by Southern analysis and suggests the absence of colorectal cancer micrometastases in lymph nodes of patients free of disease. In contrast, GCC-specific amplicons were detected in all reactions using RNA from lymph nodes of patients in the case group (Table 1). The presence of GCC-specific amplicons in these reactions was confirmed by sequencing and/or Southern analyses and suggests the presence of colorectal cancer micrometastases in lymph nodes of patients with recurrent disease. Of note, GCC mRNA was not expressed in any of 39 lymph nodes from 21 other patients without colorectal cancer (negative controls) that have been analyzed by RT-PCR to date.

TABLE 1
GCC mRNA expression in lymph nodes and patient outcome.

Patient   GCC mRNA*   DFT^(†)   OS^(§)   Vital Status
Controls
   6      (−)         97         97      Alive, NED^(¶)
   7      (−)         96        105      Alive, New 1° Colon Cancer (T₃N₁M₁)
   8      (−)         96         96      Alive
   9      (−)         82         82      Alive
  10      (−)         86         86      Died of Dehydration
  11      (−)         89         89      Alive
  12      (−)         94         94      Alive
  13      (−)         87         87      Alive
  14      (−)         86         86      Alive
  15      (−)         87         87      Alive
  16      (−)         73         73      Alive
Cases
  17      (+)         13         15      Dead 2° to Liver Metastases
  18      (+)         15         52      Dead 2° to Liver Metastases
  19      (+)          3          9      Dead 2° to Liver Metastases
  20      (+)         14         20      Dead 2° to Liver Metastases
  21      (+)          2         78      Alive with Liver Metastases
  22      (+)         12         25      Alive with Liver Metastases
  23      (+)          9         55      Dead 2° to Lung and CNS Metastases
  24      (+)         29         64      Alive with Lung and Bone Metastases
  25      (+)         17         19      Dead 2° to Liver, Lung and Bone Metastases
  26      (+)         11         17      Alive with Lung Metastases

[0071] Carcinoembryonic antigen is a glycoprotein expressed by <60% of colorectal cancers and by other tumors, normal cells, and in some non-malignant pathological conditions. RT-PCR analysis of carcinoembryonic antigen expression has been suggested to be a marker of colorectal cancer micrometastases in lymph nodes. In the present study, total RNA extracted from pooled lymph node sections was analyzed by RT-PCR using carcinoembryonic antigen-specific primers (Liefers et al., 1998, New Engl J Med 339:223-8). Nested RT-PCR failed to yield carcinoembryonic antigen-specific amplicons in reactions using total RNA from patients in the control group, but detected carcinoembryonic antigen-specific amplicons in 1 patient in the case group. The presence of carcinoembryonic antigen-specific amplicons was confirmed by sequence analysis.

[0072] GCC mRNA expression in lymph nodes and clinicopathological prognostic indicators. Case and control groups (28 patients) were compared for tumor and disease characteristics associated with disease recurrence. Groups appeared balanced with respect to: tumor grade (well differentiated: control, 2 (12.5%); case, 1 (8.3%); moderately differentiated: control, 13 (81.3%); case, 9 (75%); poorly differentiated: control, 1 (8.3%); case, 2 (12.5%)); tumor size (control, 5.7±2.3 cm; case, 4.8±1.7 cm); tumor location (right colon: control, 7 (43.8%); case, 4 (33.3%); transverse colon: control, 3 (18.8%); case, 0; sigmoid colon: control, 5 (31.3%); case, 8 (66.6%); rectum: control, 1 (6.3%); case, 0); and depth of penetration and extension into pericolic fat of tumors. Angiolymphatic invasion was observed in 3 patients in the case group but not in patients in the control group, reflecting a likely mechanism underlying metastasis in the former. Expression of GCC mRNA in lymph nodes was associated with disease recurrence in all cases (p=0.004). The odds ratio for mortality associated with GCC mRNA expression in regional lymph nodes was 16.5 (1.1-756.7, 95% CI). Sensitivity analysis demonstrated that an incremental “false negative” (death of a patient in the control group) or “false positive” (survival of a patient in the case group) result would yield an odds ratio with a 95% confidence interval encompassing 1 (no excess risk), reflecting the limitations of the small sample population employed in this analysis.

Number of SEQ ID NOs: 6

SEQ ID NO:1 (22 nt, DNA, Homo sapiens): gcccatagct ctgacctttc tg

SEQ ID NO:2 (24 nt, DNA, Homo sapiens): agagagatta gctgggcctc accc

SEQ ID NO:3 (34 nt, DNA, Homo sapiens): cagctaatct ctctgtttat agctctgacc tttc

SEQ ID NO:4 (35 nt, DNA, Homo sapiens): atctctctgt ttatagctct gacctttctg ggtgc

SEQ ID NO:5 (34 nt, DNA, Homo sapiens): cagctaatct ctctgcccat agctctgacc tttc

SEQ ID NO:6 (38 nt, DNA, Homo sapiens): gatccggctg gtgagggtgc aataaaactt tatgagta

1. A method for identifying a molecular marker useful for detecting tumor cells metastasized from an origin tissue to a destination tissue or fluid, comprising the steps of: a) down-regulating in a population of origin tissue cells the activity of a transcription factor associated with terminally differentiated origin tissue; b) comparing an expression profile of the population of down-regulated origin cells with the expression profile of a population of control origin cells; c) identifying candidate markers which are expressed in the population of control origin cells but not in the population of down-regulated origin cells; and d) comparing expression of the candidate markers in the population of control origin cells, a population of cancerous origin cells and a population of destination cells, wherein a candidate marker that is expressed in the population of control origin cells and the population of cancerous origin cells and not in the population of destination cells is useful as a molecular marker for the detection of cancer metastasized from the origin tissue to the destination tissue or fluid.
 2. The method of claim 1 wherein the activity of the transcription factor is down-regulated by a method selected from the group consisting of down-regulating the transcription factor gene, down-regulating the activity of the transcription factor and activating a signaling event that inactivates the transcription factor.
 3. The method of claim 1 wherein the population of down-regulated origin cells is derived from a cdx2-null intestinal polyp.
 4. The method of claim 1 wherein the molecular marker is a polynucleic acid and the expression profiles are compared by a technique selected from the group consisting of differential display, subtractive hybridization, expression array, Serial Analysis of Gene Expression (SAGE), Rapid Analysis of Gene Expression (RAGE), Massively Parallel Signature Sequencing (MPSS) and Tandem Arrayed Ligation of Expressed Sequence Tags (TALEST).
 5. The method of claim 1 wherein the molecular marker is a protein and the expression profiles are compared by a technique selected from the group consisting of 2-D gel electrophoresis and Isotope-Coded Affinity Tags (ICAT).
 6. The method of claim 1 wherein the origin tissue and destination tissue are mammalian.
 7. The method of claim 6 wherein the origin tissue and destination tissue are human.
 8. The method of claim 1 wherein the control origin cells are from an origin tissue which is selected from the group consisting of colorectal, intestine, stomach, liver, mouth, esophagus, throat, thyroid, skin, brain, kidney, pancreas, breast, cervix, ovary, uterus, testicle, prostate, bone, muscle, bladder and lung.
 9. The method of claim 1 wherein the population of control origin cells is a cell line selected from the group consisting of T84, Caco2, HT29, SW480, SW620, NCI H508, SW1116, SW1463, Hep G2, and HeLa.
 10. The method of claim 1 wherein the cancerous origin cells are cancer cells from tissue selected from the group consisting of colon, stomach, liver, throat, thyroid, skin, brain and lung.
 11. The method of claim 1 wherein the population of cancerous origin cells is a cell line selected from the group consisting of T84, Caco2, HT29, SW480, SW620, NCI H508, SW1116, SW1463, Hep G2, and HeLa.
 12. The method of claim 1 wherein the destination tissue or body fluid is selected from the group consisting of lymph node, blood, cerebral spinal fluid, and bone marrow.
 13. The method of claim 1 wherein the transcription factor is selected from the group consisting of Cdx2, STAT5, NKX3.1, FREAC-1, FREAC-2, Pit1, HNF4, LFB1, IPF1, Isl1 and MyoD.
 14. The method of claim 1 which comprises the additional step of isolating the molecular marker of step d).
 15. The method of claim 1 wherein the transcription factor gene is isolated by the steps of a) isolating a transcription factor that binds to the regulatory regions of a gene associated with terminal differentiation of the origin tissue; and b) isolating the gene that expresses the transcription factor.
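
The selection logic recited in steps b) through d) of claim 1 can be pictured as set operations on expression calls. The Python sketch below is a simplified illustration under stated assumptions and forms no part of the claims: expression is reduced to a binary expressed/not-expressed call, the function names are hypothetical, and the marker names are placeholders rather than real genes.

```python
# Hypothetical expression calls: each set holds the markers scored as
# "expressed" in one cell population. All marker names are placeholders.
control_origin = {"marker_A", "marker_B", "marker_C", "marker_D"}
downregulated_origin = {"marker_C", "marker_D"}
cancerous_origin = {"marker_A", "marker_C"}
destination = {"marker_B", "marker_E"}


def candidate_markers(control, downregulated):
    """Steps b) and c): expressed in control origin cells but not in
    origin cells whose transcription factor activity was down-regulated."""
    return control - downregulated


def metastasis_markers(candidates, control, cancerous, dest):
    """Step d): keep candidates expressed in control and cancerous origin
    cells and not expressed in destination cells."""
    return {
        m for m in candidates
        if m in control and m in cancerous and m not in dest
    }


candidates = candidate_markers(control_origin, downregulated_origin)
selected = metastasis_markers(
    candidates, control_origin, cancerous_origin, destination
)

if __name__ == "__main__":
    print("candidates:", sorted(candidates))
    print("selected markers:", sorted(selected))
```

In practice expression profiles are quantitative, so a real comparison would need thresholds or statistical criteria in place of the binary calls assumed here.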